fix(code): unwedge stuck cloud message queues #2088
Draft
VojtechBartos wants to merge 1 commit into main from
Cloud follow-ups got permanently stuck in the local queue when the SSE watcher exhausted its reconnect budget: Gate B in sendCloudPrompt queued the message but never restored the SSE stream, so no turn_complete ever arrived to drain the queue. Two adjacent holes in Gate A and the cloudStatus handler could also strand a queue on a missed turn_complete. Each fix is a few lines, all in SessionService.

Generated-By: PostHog Code
Task-Id: 3de9f10b-b668-45c9-8688-eb94b3260be5
Problem
Users on cloud runs report messages getting permanently stuck in the "queued" state, with no automatic recovery. Reproduced from a real session log: SSE streams dropped repeatedly with `error: 'terminated'`, the cloud-task watcher exhausted its 5-attempt reconnect budget, and every subsequent follow-up just sat in the queue. The regression was introduced in #1905 (Gate B was added without an SSE-reconnect kick), and #2060 only patched the success path.
What's going on
`SessionService.sendCloudPrompt` has three queue gates. Two of them, along with the adjacent `handleCloudTaskUpdate` status branch, could strand a queue forever:
- Gate B (`status !== "connected"`): queued the message but never tried to bring the SSE stream back, so no `turn_complete` ever arrived to trigger a drain. Fix: when status is `disconnected` or `error`, fire-and-forget `retryCloudTaskWatch(taskId)` (already used by the manual "Retry" button). This is the change that actually unwedges the user-reported scenario.
- Gate A (`cloudStatus !== "in_progress"`): set `isPromptPending: true` so the boot-time UI could show a spinner, but the flag was only cleared by `turn_complete`. A missed `turn_complete` left the flag stuck, and `sendQueuedCloudMessages`'s own `!isPromptPending` guard then blocked the drain. Fix: drop the eager `isPromptPending: true` write. The flag now means only "an actual prompt is in flight."
- `handleCloudTaskUpdate` status branch: explicitly skipped auto-flush on `cloudStatus → in_progress` to avoid racing the agent's initial `clientConnection.prompt()`. That race only exists before `run_started` flips status to `"connected"`. Fix: if a status update with `in_progress` arrives and `session.status === "connected"` and the queue is non-empty, schedule a drain. `sendQueuedCloudMessages` still bails on `isPromptPending`, preserving the original race protection.
Tests
5 new vitest cases covering each path; full code-app suite (1150 tests) passes.
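For reviewers who want the three paths in one place, here is a minimal, self-contained sketch of the post-fix gate logic. It is a simplified model, not the actual `SessionService` code: `SessionModel`, `makeSession`, the `retries`/`sent` arrays, the exact status unions, the ordering of Gate A before Gate B, the synchronous drain, and joining queued messages into one prompt are all assumptions made for illustration. The third gate is not described above and is left out.

```typescript
// Hypothetical status unions; only the values named in this PR are certain.
type SessionStatus = "connected" | "connecting" | "disconnected" | "error";
type CloudStatus = "queued" | "in_progress" | "completed";

interface Session {
  taskId: string;
  status: SessionStatus;
  cloudStatus: CloudStatus;
  isPromptPending: boolean;
  queue: string[];
}

// Hypothetical helper: a healthy session with optional overrides.
function makeSession(overrides: Partial<Session> = {}): Session {
  return {
    taskId: "task-1",
    status: "connected",
    cloudStatus: "in_progress",
    isPromptPending: false,
    queue: [],
    ...overrides,
  };
}

class SessionModel {
  session: Session;
  retries: string[] = []; // taskIds passed to retryCloudTaskWatch (stand-in for the real reconnect)
  sent: string[] = [];    // prompts actually delivered (stand-in for the real send)

  constructor(session: Session) {
    this.session = session;
  }

  retryCloudTaskWatch(taskId: string): void {
    this.retries.push(taskId);
  }

  sendCloudPrompt(text: string): "queued" | "sent" {
    const s = this.session;
    // Gate A: run not in progress yet. Queue only; no eager isPromptPending write.
    if (s.cloudStatus !== "in_progress") {
      s.queue.push(text);
      return "queued";
    }
    // Gate B: stream not connected. Queue, and kick a reconnect if the watcher gave up.
    if (s.status !== "connected") {
      s.queue.push(text);
      if (s.status === "disconnected" || s.status === "error") {
        this.retryCloudTaskWatch(s.taskId); // fire-and-forget in the real code
      }
      return "queued";
    }
    s.isPromptPending = true; // an actual prompt is now in flight
    this.sent.push(text);
    return "sent";
  }

  handleCloudTaskUpdate(next: CloudStatus): void {
    const s = this.session;
    s.cloudStatus = next;
    // Drain only once the stream is connected; the pending guard below still applies.
    if (next === "in_progress" && s.status === "connected" && s.queue.length > 0) {
      this.sendQueuedCloudMessages();
    }
  }

  handleTurnComplete(): void {
    this.session.isPromptPending = false;
    this.sendQueuedCloudMessages();
  }

  sendQueuedCloudMessages(): void {
    const s = this.session;
    if (s.isPromptPending || s.queue.length === 0) return; // original race protection
    const batch = s.queue.splice(0); // take everything that is queued
    s.isPromptPending = true;
    this.sent.push(batch.join("\n")); // assumption: queued messages go out as one prompt
  }
}
```

In this model, a prompt sent while `status` is `"error"` is queued and records one reconnect attempt, a prompt queued behind Gate A drains as soon as `in_progress` arrives on a connected session, and a pending prompt still blocks the drain until `turn_complete`.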
Created with PostHog Code